Using Proximity Counts to Identify Similar Samples
This example demonstrates the use of quantile regression forest (QRF) proximity counts to identify similar samples. In this scenario, we train a QRF on a corrupted dataset to predict individual pixel values. We then retrieve the proximity counts for samples in a corrupted test set. For each test digit, we display the corrupted sample alongside its uncorrupted version and a set of similar (uncorrupted) training samples selected by their proximity counts. The similar samples for each digit are ordered from highest to lowest proximity count, arranged left to right and top to bottom. The example shows that proximity counts can identify similar samples even when both the training and test data are noisy.
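To make the idea concrete, the following is a minimal conceptual sketch and not the library's implementation. It assumes a proximity count is the number of trees in which a training sample falls into the same leaf as a test sample. The helper `proximity_counts` and the toy leaf assignments are hypothetical and written only for illustration; in practice the leaf IDs would come from a fitted forest.

```python
from collections import Counter


def proximity_counts(train_leaves, test_leaves, max_proximities=None):
    """Rank training samples by how many trees put them in a test sample's leaf.

    train_leaves[i][t] is the leaf ID that tree t assigns to training sample i;
    test_leaves[j][t] is the same for test sample j. (Hypothetical helper.)
    """
    results = []
    for test_row in test_leaves:
        counts = Counter()
        for train_idx, train_row in enumerate(train_leaves):
            # Number of trees in which both samples land in the same leaf.
            shared = sum(a == b for a, b in zip(train_row, test_row))
            if shared > 0:
                counts[train_idx] = shared
        # Highest count first; ties broken by the lower training index.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append(ranked[:max_proximities])
    return results


# Toy leaf assignments (illustrative only): 4 training samples, 2 test samples, 3 trees.
train_leaves = [
    [1, 4, 2],
    [1, 5, 2],
    [3, 4, 6],
    [3, 5, 6],
]
test_leaves = [
    [1, 4, 2],
    [3, 5, 2],
]

proximities = proximity_counts(train_leaves, test_leaves, max_proximities=2)
for test_idx, ranked in enumerate(proximities):
    print(f"test sample {test_idx}: {ranked}")
```

Each entry is a `(training index, count)` pair, so the first pair for a test sample points to the training sample that shares a leaf with it in the most trees. Looking up those indices in the uncorrupted training set gives the similar digits that the example displays.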